Abstract

We consider the problem of approximating the empirical Shannon entropy of a high-frequency data stream when space limitations make exact computation infeasible. It is known that $\alpha$-dependent quantities such as the Rényi and Tsallis entropies can be estimated efficiently and unbiasedly from low-dimensional $\alpha$-stable data sketches. An approximation to the Shannon entropy can be obtained from either of these quantities by taking $\alpha$ sufficiently close to 1. However, practical guidelines for the choice of $\alpha$ are lacking. We avoid this problem by going directly to the limit. We show that the projection variables used in estimating the Rényi entropy can be transformed to have a proper distributional limit as $\alpha$ approaches 1. The Shannon entropy can then be estimated directly from a data sketch based on this limiting distribution. We derive properties of the distribution, showing that it has a surprisingly simple characteristic function $(i\theta)^{i\theta}$ and that the $k$th moment of the exponential of such a variable is $k^k$ for all non-negative real values of $k$. These properties enable the Shannon entropy to be estimated directly from the associated data sketch as the logarithm of a simple average. We obtain the Fisher information for the statistical problem of recovering the entropy from the data sketch and hence a lower bound on the standard error of the estimated entropy. We show that our proposed estimator has theoretical statistical efficiency of 96.8% and confirm this with an empirical study. Finally, we demonstrate that in order for the estimator to have $1 + \epsilon$ coverage with high probability the sketch must have size $O(1/\epsilon^2)$, in agreement with theoretical bounds.

1 Introduction

Streaming data is ubiquitous in a wide range of areas from engineering and information technology, finance and commerce, to atmospheric physics and earth sciences. For background, see [9], [27].

The Shannon entropy provides an important characterisation of a data stream and many algorithms 
have been developed for its estimation [1, [B, [22]. Areas of application extend far beyond that of 
network traffic monitoring, e.g., entropy estimation of neural spike trains, or of images in video 
streams for the purpose of visual tracking. Our own particular interest is in developing convergence diagnostics when monitoring extensive Markov chain Monte Carlo simulations of posterior distributions in Bayesian inference.

This paper focuses on the estimation of Shannon entropy by $\alpha$-stable data sketching, i.e., transforming distinct stream elements online to distinct realizations of a strictly stable variable of index $\alpha$ (called $\alpha$-stable hereafter), and storing weighted linear combinations of these realizations, independently replicated $k$ times [18]. There is a long history of the use of the $\alpha$-stable distribution in the literature, e.g., in cardinality estimation [id. 13. fiol. 13 . 16,0], norm and distance estimation [2, 11, 15, 2(3, 3, 23, iil, and, more recently, entropy estimation 2^, 12]. However, none of these projection methods estimates the Shannon entropy directly. Instead, they estimate alternative $\alpha$-dependent quantities such as the entropies of Rényi and Tsallis, and rely on these measures providing adequate approximations to the Shannon entropy when $\alpha$ is close to 1.

Our contribution is to provide an easily implemented direct solution by forming a data sketch with one of the simplest stable distributions, the maximally skewed distribution with $\alpha = 1$. We show that this distribution arises in the limit when considering the $\alpha$-stable transformations involved in data sketches for the Rényi entropy. We derive this result in Section 2 and obtain a compact representation of the characteristic function of the limiting distribution along with moments of the exponential of a random variable with this distribution. The simple form of these moments is then exploited in Section 3.1 to derive a family of log-mean estimators of the Shannon entropy, indexed by $\zeta > 0$. We show that the optimal estimator is given by $\zeta = 1.15$ and has asymptotic efficiency of 97.8%, in the sense that its variance is within 2.2% of the theoretical Cramér-Rao lower bound for the variance of any estimator based on the same sketch. The simple estimator with $\zeta = 1$ has asymptotic efficiency of 96.8%. We provide a table of small-sample bias corrections for both estimators in Section 3.1 and confirm the efficiency results with an empirical study in Section 3.2. Finally, in Section 3.3 we establish that the probability of the estimator having error greater than $\epsilon$ decreases exponentially with $k$, where $k$ is the length of the data sketch, when $\zeta \le 1$. For small $\epsilon$ it decreases exponentially with $k\epsilon^2$, which leads to a sample complexity bound on the length $k$ of the data sketch, and hence on the storage requirements of the algorithm implementing this estimation procedure.



1.1 Notation 

We follow the notation and terminology of [18], [10]. Let $S_T$ denote a data stream of length $T$, with elements of the form $(i_t, d_t)$, where the item type $i_t$ belongs to a large or possibly infinite set $V = \{c_1, c_2, \ldots, c_N\}$ and the associated quantity is $d_t$ (either positive or negative), $t = 1, 2, \ldots, T$.

Definition 1.1. The accumulation vector of $S_T$ at stage $T$ is $\mathbf{a}_T = (a_1, a_2, \ldots)$, where

$$a_j = \sum_{t=1}^{T} d_t\, I(i_t = c_j), \qquad j = 1, 2, \ldots, N, \tag{1}$$

is the cumulative quantity of elements of type $c_j$ at stage $T$ (the indicator function $I(i_t = c_j)$ equals 1 if $i_t = c_j$ and 0 otherwise).

Note that we are assuming that the data types can be ordered by some convention, e.g., by the order of their first appearance. We also assume that at stage $T$ each of the terms $a_j$ is non-negative. This is the relaxed strict-turnstile model introduced by Li [10].

The empirical entropies of Shannon, Rényi and Tsallis are then given respectively by

$$H(\mathbf{p}) = -\sum_{j=1}^{N} p_j \log p_j, \qquad H_\alpha(\mathbf{p}) = \frac{1}{1-\alpha} \log \sum_{j=1}^{N} p_j^{\alpha}, \qquad S_\alpha(\mathbf{p}) = \frac{1}{\alpha-1}\Bigl(1 - \sum_{j=1}^{N} p_j^{\alpha}\Bigr), \tag{2}$$

where $p_j = a_j / \sum_{i=1}^{N} a_i$ and by convention $p \log(p)$ is defined to be 0 when $p = 0$. When the number of positive terms in the accumulation vector is large, storing $\mathbf{a}_T$ or equivalently $(p_1, p_2, \ldots, p_N)$ may be infeasible. This has motivated the use of data sketches and other synopsis construction techniques that store and update online a low dimensional representation of the stream [1].

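To make the definitions concrete, the following minimal sketch computes the accumulation vector and the three empirical entropies exactly for a toy stream. The function names and the toy stream are illustrative assumptions rather than anything taken from the paper, and the formulas are the standard definitions of the Shannon, Rényi and Tsallis entropies. Exact computation of this kind is what becomes infeasible when the number of distinct item types is large.

```python
import math
from collections import defaultdict

def accumulate(stream):
    """Accumulation vector: maps each item type to its cumulative quantity."""
    acc = defaultdict(float)
    for item, quantity in stream:
        acc[item] += quantity
    return acc

def frequencies(acc):
    """Relative frequencies p_j = a_j / sum_i a_i, keeping positive terms only."""
    total = sum(acc.values())
    return [a / total for a in acc.values() if a > 0]

def shannon(p):
    return -sum(q * math.log(q) for q in p)

def renyi(p, alpha):
    return math.log(sum(q ** alpha for q in p)) / (1.0 - alpha)

def tsallis(p, alpha):
    return (1.0 - sum(q ** alpha for q in p)) / (alpha - 1.0)

# Toy stream of (item type, quantity) pairs; quantities may be negative.
stream = [("a", 3), ("b", 1), ("a", -1), ("c", 4), ("b", 1)]
p = frequencies(accumulate(stream))

# With alpha close to 1 the Renyi and Tsallis values approach the Shannon value.
print(shannon(p), renyi(p, 0.999), tsallis(p, 0.999))
```

The toy stream accumulates to quantities 2, 2 and 4, so the frequencies are 1/4, 1/4 and 1/2.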
2 Estimating entropy via random projections

Random projection methods require that each element type $c_j$ in the data stream can be transformed into a distinct random variable $R(c_j)$. In practice, this is achieved "to adequate approximation" by (i) hashing $c_j$ to an integer (or vector of integers), (ii) using these integers to seed a pseudo-random number generator, and (iii) using the seeded generator to simulate the random variable $R(c_j)$. The projection is then accumulated online as $\sum_{t=1}^{T} R(i_t) d_t = \sum_{j=1}^{N} R(c_j) a_j$. This provides a single element of the data sketch. A further $k - 1$ elements are generated independently in parallel to form the $k$-dimensional sketch.

Special properties of the $\alpha$-stable distributions [28] motivate their use in data sketching. For example, the $\alpha$-frequency moment $\sum_{j=1}^{N} a_j^{\alpha}$ can be recovered approximately from an $\alpha$-stable data sketch since $\sum_{j=1}^{N} a_j R(c_j)$ has the same distribution as $R\,(\sum_{j=1}^{N} a_j^{\alpha})^{1/\alpha}$, where $R$ and $R(c_j)$, $j = 1, 2, \ldots, N$, independently, each have the same $\alpha$-stable distribution. Dividing by $\sum_{j=1}^{N} a_j = \sum_{t=1}^{T} d_t$, the total cumulative quantity of all items in the data stream at stage $T$, we have a similar distributional identity involving $B_\alpha = (\sum_{j=1}^{N} p_j^{\alpha})^{1/\alpha}$, and hence this quantity can be estimated from a data sketch in the same way that a scale parameter can be estimated in an observed sample of size $k$ from an $\alpha$-stable distribution. Estimates of the Rényi and Tsallis entropies are then obtained by substituting the estimated value of $B_\alpha$ in (2), and by choosing $\alpha$ close to 1; these quantities provide an approximation to the Shannon entropy [22], as shown by the following result.



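Lemma 2.1. As $\alpha \to 1$, the Tsallis entropy $S_\alpha(\mathbf{p})$ and the Rényi entropy $H_\alpha(\mathbf{p})$ both converge to the Shannon entropy $H(\mathbf{p}) = -\sum_{j=1}^{N} p_j \log p_j$:

$$S_\alpha(\mathbf{p}) \to H(\mathbf{p}), \qquad H_\alpha(\mathbf{p}) \to H(\mathbf{p}).$$

Proof. Writing $S_\alpha(\mathbf{p}) = (\sum_{j=1}^{N} p_j^{\alpha} - 1)/(1 - \alpha)$ and applying l'Hôpital's rule,

$$\lim_{\alpha \to 1} \frac{\sum_{j=1}^{N} p_j^{\alpha} - 1}{1 - \alpha} = \lim_{\alpha \to 1} \frac{\sum_{j=1}^{N} p_j^{\alpha} \log p_j}{-1} = -\sum_{j=1}^{N} p_j \log p_j.$$

The second limit is found similarly. □
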
We now show that the limiting process can be carried out within the family of $\alpha$-stable variables, leading to a projection variable that enables the Shannon entropy to be estimated directly. Suppose that $Z_\alpha$ is a positive, strictly stable random variable with index $0 < \alpha < 1$, having Laplace transform $\mathrm{E}\exp(-\lambda Z_\alpha) = e^{-\lambda^{\alpha}}$ for $\lambda > 0$. Let $(Z_\alpha^{(1)}, \ldots, Z_\alpha^{(N)})$ be a vector of independent copies of $Z_\alpha$ and let $\mathbf{p} = (p_1, \ldots, p_N)$ be a vector of frequencies with $\sum_{i=1}^{N} p_i = 1$. Then

$$\sum_{i=1}^{N} p_i Z_\alpha^{(i)} \sim Z_\alpha B_\alpha, \tag{3}$$

where the symbol $\sim$ denotes equality in distribution, and $Z_\alpha \to 1$ as $\alpha \to 1$ (shown in the proof of Lemma 2.2).

Starting from the Rényi entropy $H_\alpha(\mathbf{p}) = \alpha/(1-\alpha) \log B_\alpha$, we obtain that $(1 - B_\alpha)/(1 - \alpha) = (1 - \alpha)^{-1}[1 - \exp\{(1 - \alpha) H_\alpha(\mathbf{p})/\alpha\}] \to \delta$ as $\alpha \to 1$ by Lemma 2.1, where $-\delta$ is the Shannon entropy of $\mathbf{p}$.



Next, we define $Y_\alpha = (1 - Z_\alpha)/(1 - \alpha) + \log(1 - \alpha)$, and, using (3), we obtain

$$\sum_{i=1}^{N} p_i Y_\alpha^{(i)} = \sum_{i=1}^{N} p_i \left[\frac{1 - Z_\alpha^{(i)}}{1 - \alpha} + \log(1 - \alpha)\right] \sim \left[\frac{1 - Z_\alpha}{1 - \alpha} + \log(1 - \alpha)\right] + Z_\alpha\, \frac{1 - B_\alpha}{1 - \alpha}, \tag{4}$$

so that taking limits $\sum_{i=1}^{N} p_i Y_1^{(i)} \sim Y_1 + \delta$, provided $Y_\alpha$ has a proper limit as $\alpha \to 1$. This is established in the following lemma.

Lemma 2.2. The random variable $Y_\alpha$ has a proper limit $Y_1$ as $\alpha \to 1$. The characteristic function $\phi(\theta)$ of $Y_1$ is $(i\theta)^{i\theta}$, and the $k$th moment of the random variable $\exp(Y_1)$ is $k^k$ for all $k \ge 0$.

See Appendix for proof, where we show that $Y_1$ has a maximally skewed stable distribution with $\alpha = 1$, and obtain the characteristic function in the equivalent form $\phi(\theta) = \exp(-\tfrac{1}{2}\pi|\theta| + i\theta \log|\theta|)$. Denoting the distribution function of $Y_1 + \delta$ by $G(x; \delta)$ we have the following.

Lemma 2.3. Let $X_1, \ldots, X_N \sim G(x; 0)$ be independent random variables, and let $p_1, \ldots, p_N$ be positive constants satisfying $\sum_{j=1}^{N} p_j = 1$. Then,

$$\sum_{i=1}^{N} p_i X_i \sim G\Bigl(x;\ \sum_{i=1}^{N} p_i \log p_i\Bigr).$$

Proof. Straightforward using Lemma 2.2 and the fact that $\sum_{j=1}^{N} p_j = 1$. □
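
The step labelled straightforward can be spelled out as follows. This is a worked version under the assumption that $(i\theta)^{i\theta}$ is read as $\exp\{i\theta \log(i\theta)\}$ with the principal branch of the logarithm, so that $\log(i p_j \theta) = \log(i\theta) + \log p_j$ for $p_j > 0$.

```latex
\begin{aligned}
\mathrm{E}\exp\Bigl(i\theta \sum_{j=1}^{N} p_j X_j\Bigr)
  &= \prod_{j=1}^{N} \phi(p_j\theta)
   = \exp\Bigl\{\sum_{j=1}^{N} i p_j \theta\,[\log(i\theta) + \log p_j]\Bigr\} \\
  &= \exp\{i\theta \log(i\theta)\}\,
     \exp\Bigl\{i\theta \sum_{j=1}^{N} p_j \log p_j\Bigr\}
   = (i\theta)^{i\theta}\, e^{i\theta\delta},
   \qquad \delta = \sum_{j=1}^{N} p_j \log p_j .
\end{aligned}
```

The second line uses $\sum_j p_j = 1$. The result is the characteristic function of $Y_1 + \delta$, that is, of the distribution $G(x; \delta)$.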



Thus by projecting to maximally skewed stable random variables with distribution function $G(x; 0)$, i.e., $R(c_j) \sim G(x; 0)$, and storing projection elements of the form $\sum_{t=1}^{T} R(i_t) d_t$, we reduce the problem of recovering the Shannon entropy to that of estimating a location parameter.
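
The pipeline can be sketched end to end as follows. Several choices here are assumptions for illustration and are not taken from the paper: the draw from $G(x; 0)$ uses the Chambers-Mallows-Stuck formula for index 1 and skewness $-1$, shifted and scaled to match the characteristic function $\exp(-\tfrac{1}{2}\pi|\theta| + i\theta\log|\theta|)$; hashing is imitated by seeding Python's `random.Random` with a string built from the replicate index and the item; each sketch element is divided by the running total $\sum_t d_t$ so that it has location $\delta$; and projection values are cached only to keep the demonstration fast. All function names are hypothetical.

```python
import math
import random
from collections import Counter
from functools import lru_cache

def sample_g0(rng):
    """One draw from G(x; 0), the maximally skewed stable law with index 1."""
    v = math.pi * (rng.random() - 0.5)   # uniform on [-pi/2, pi/2)
    w = rng.expovariate(1.0)             # unit exponential
    u = math.pi / 2 - v
    return u * math.tan(v) + math.log(w * math.cos(v) / u)

@lru_cache(maxsize=None)
def projection_value(item, replicate):
    """R(c_j) for one replicate: the same item always maps to the same draw."""
    return sample_g0(random.Random(f"{replicate}:{item}"))

def build_sketch(stream, k):
    """One pass over the stream, updating k weighted sums, then normalising."""
    sketch, total = [0.0] * k, 0.0
    for item, quantity in stream:
        total += quantity
        for r in range(k):
            sketch[r] += projection_value(item, r) * quantity
    return [s / total for s in sketch]

def entropy_from_sketch(ys):
    """Log of the average of exp(y), negated; computed with a max shift."""
    m = max(ys)
    delta_hat = m + math.log(sum(math.exp(y - m) for y in ys) / len(ys))
    return -delta_hat

# Toy stream: 1000 unit updates over 20 item types with Zipf-like weights.
rng = random.Random(0)
items = [f"item{j}" for j in range(20)]
weights = [1.0 / (j + 1) for j in range(20)]
stream = [(rng.choices(items, weights)[0], 1) for _ in range(1000)]

counts = Counter(item for item, _ in stream)
true_entropy = -sum(c / 1000 * math.log(c / 1000) for c in counts.values())
estimate = entropy_from_sketch(build_sketch(stream, k=500))
print(true_entropy, estimate)
```

With $k = 500$ the estimate is expected to differ from the exact value by roughly $\sqrt{3/k} \approx 0.08$ in standard deviation, so the two printed numbers should be close but not identical.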



3 The log-mean estimator 
3.1 Derivation 

The following lemma introduces a family of log-mean estimators of $\delta$ indexed by $\zeta$, where $-\delta$ is the Shannon entropy. See Appendix for proof. We quantify the performance of these estimators in terms of asymptotic relative efficiency (ARE), defined as the ratio of the variance of the limiting distribution of the maximum likelihood estimator (MLE) to the variance of the limiting distribution of the estimator in question, as the sample size increases [21]. The former attains the theoretical lower bound called the Cramér-Rao lower bound, so the ARE compares the performance of the estimator to the best possible performance. For practical purposes the simplest of these estimators, with $\zeta = 1$, is to be recommended, and we begin by describing it below.

Let $y_1, \ldots, y_k$ be independent samples from the $G(y; \delta)$ distribution. Simulating from the maximally skewed stable distribution $G(y; 0)$ via the algorithm of Chambers et al. [a] shows that this distribution has very heavy negative tails, so we exponentiate to flatten them.

Consider the transformation $w_j = \exp(y_j) = \exp(\delta + z_j)$, where $z_j \sim G(z; 0)$ independently, $j = 1, \ldots, k$, with characteristic function $\phi(\theta) = \mathrm{E}\exp(i\theta z_j) = (i\theta)^{i\theta}$, for $\theta \in \mathbb{R}$. It follows that $\mathrm{E} w_j = e^{\delta}\phi(-i) = e^{\delta}$ and $\mathrm{E} w_j^2 = e^{2\delta}\phi(-2i) = 4e^{2\delta}$, so $k^{-1}\sum_{j=1}^{k} \exp(y_j)$ is unbiased for $e^{\delta}$ and has variance $3e^{2\delta}/k$. The Cramér-Rao lower bound for estimating $\delta$ is approximately $(0.3445)^{-1}$ [21], so the Fisher information about $e^{\delta}$ is $0.3445\exp(-2\delta)$. Hence, the ARE of the mean estimator to the MLE is $(0.3445 \times 3)^{-1} \approx 0.968$. By taking the logarithm, we obtain the log-mean estimator of $\delta$: $\log\bigl(k^{-1}\sum_{j=1}^{k} \exp(y_j)\bigr)$. Lemma 3.1 presents the family of log-mean estimators indexed by $\zeta > 0$, where $w_j = \exp(\zeta y_j)$.

Lemma 3.1. Let $y_1, \ldots, y_k$ be independent samples from the $G(y; \delta)$ distribution, and $\zeta > 0$ a constant. The bias-corrected log-mean estimator of $\delta$ is

$$\hat\delta_{lm,bc} = \zeta^{-1} \log\Bigl(\zeta^{-\zeta} k^{-1} \sum_{j=1}^{k} \exp(\zeta y_j)\Bigr) - BC,$$

where $BC$ is the additive bias in small samples:

$$BC = \zeta^{-1}\, \mathrm{E} \log\Bigl(\zeta^{-\zeta} k^{-1} \sum_{j=1}^{k} \exp(\zeta z_j)\Bigr),$$

and $z_j \sim G(z; 0)$ i.i.d. For $\zeta = 1.15$, the estimator is near-optimal with largest ARE of 0.978; for $\zeta = 1.0$, the estimator has ARE of 0.968.

The log-mean estimator has finite variance in small samples for all values of $\zeta > 0$, but it has exponentially decreasing tail bounds only for $\zeta \le 1$. See Sections 3.2 and 3.3 for details. When $\zeta \le 1$, the maximum ARE of 0.968 is attained at $\zeta = 1$. In other words, $\hat\delta_{lm,bc}$ with $\zeta = 1$ is 96.8% as efficient as the best possible estimator, the MLE.

Table 1 presents the small-sample bias for $\zeta = 1$ and $\zeta = 1.15$, and various sample sizes $k$, approximated via simulations. The standard error, computed as the sample standard deviation divided by $\sqrt{n}$, where $n$ is the number of repetitions, appears in brackets.
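
The simulation-based approximation of the bias can be sketched as below. This is an illustrative sketch, not the authors' code: the sampler is an assumed Chambers-Mallows-Stuck style draw from $G(z; 0)$, the function names are hypothetical, and the number of replicates is much smaller than the $5 \times 10^5$ used for Table 1, so the result is correspondingly noisier.

```python
import math
import random

def sample_g0(rng):
    """One draw from G(z; 0) (assumed Chambers-Mallows-Stuck form, index 1)."""
    v = math.pi * (rng.random() - 0.5)   # uniform on [-pi/2, pi/2)
    w = rng.expovariate(1.0)             # unit exponential
    u = math.pi / 2 - v
    return u * math.tan(v) + math.log(w * math.cos(v) / u)

def log_mean(ys, zeta=1.0):
    """zeta^-1 * log(zeta^-zeta * k^-1 * sum_j exp(zeta * y_j)), max-shifted."""
    m = max(ys)
    log_avg = zeta * m + math.log(sum(math.exp(zeta * (y - m)) for y in ys) / len(ys))
    return (log_avg - zeta * math.log(zeta)) / zeta

def simulated_bias(k, zeta=1.0, replicates=20000, seed=1):
    """Monte Carlo approximation of BC: mean of the estimator when delta = 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        total += log_mean([sample_g0(rng) for _ in range(k)], zeta)
    return total / replicates

bc = simulated_bias(k=10)   # compare with the k = 10, zeta = 1 entry of Table 1

# Bias-corrected estimate from a sketch of length 10 with delta = -2.
rng = random.Random(7)
ys = [-2.0 + sample_g0(rng) for _ in range(10)]
delta_hat_bc = log_mean(ys) - bc
print(bc, delta_hat_bc)
```

Subtracting the maximum before exponentiating avoids overflow and does not change the value of the estimator.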



3.2 Small sample performance 

Figure 1 compares the performance of the log-mean estimator in terms of relative mean square error (MSE) in small samples, alongside the Cramér-Rao lower bound given by $(k \times 0.3445)^{-1}$. The MSE is defined as the ratio of $\mathrm{E}(\hat\delta_{lm,bc} - \delta)^2$ to $\delta^2$, where the expectation is approximated via simulations. The log-mean estimators with $\zeta = 1$ and $\zeta = 1.15$ have very good small sample performance for sample sizes $k > 20$. Furthermore, the difference in MSE between the two estimators is negligible.

Figure 2 shows the corresponding performance of the Rényi entropy estimator with $\alpha = 0.98$, using the optimal quantile estimator for the $\ell_\alpha$ quasi-norm; this is the recommended estimator in [22]. Following [7], we obtain the Cramér-Rao lower bound, and plot it for comparison. As an approximate estimator of entropy, the Rényi estimator has good small sample performance for $k > 20$. However, it is an estimator of entropy only in the limit as $\alpha \to 1$, whereas the log-mean estimator approximates the entropy directly. For $\alpha > 0.98$, we encountered numerical instabilities in the R contributed package fBasics, used to simulate from the stable distribution, and to approximate the density, distribution, and quantile functions. These instabilities were also pointed out in [24].

  k     $\zeta = 1$                      $\zeta = 1.15$
 10     -0.1617   (8.8584 x 10^-4)     -0.1810  (8.529 x 10^-4)
 20     -0.07795  (5.745 x 10^-4)      -0.08792 (5.710 x 10^-4)
 30     -0.05113  (4.616 x 10^-4)      -0.05786 (4.588 x 10^-4)
 40     -0.03857  (3.973 x 10^-4)      -0.04365 (3.950 x 10^-4)
 50     -0.03060  (3.530 x 10^-4)      -0.03469 (3.509 x 10^-4)
 60     -0.02501  (3.216 x 10^-4)      -0.02841 (3.197 x 10^-4)
 70     -0.02170  (2.967 x 10^-4)      -0.02459 (2.950 x 10^-4)
 80     -0.01851  (2.766 x 10^-4)      -0.02109 (2.749 x 10^-4)
 90     -0.01662  (2.605 x 10^-4)      -0.01886 (2.590 x 10^-4)
100     -0.01514  (2.470 x 10^-4)      -0.01719 (2.457 x 10^-4)
110     -0.01316  (2.355 x 10^-4)      -0.01504 (2.342 x 10^-4)
120     -0.01278  (2.253 x 10^-4)      -0.01451 (2.241 x 10^-4)
130     -0.01170  (2.163 x 10^-4)      -0.01326 (2.151 x 10^-4)
140     -0.01070  (2.083 x 10^-4)      -0.01216 (2.072 x 10^-4)
150     -0.009971 (2.008 x 10^-4)      -0.01133 (1.997 x 10^-4)

Table 1: Approximate small-sample bias $BC$ for the log-mean estimator computed over $n = 5 \times 10^5$ replicates. The standard error of the approximations appears in brackets.




[Plots omitted; horizontal axis: sample size ($k$), from 10 to 150.]

Figure 1: Comparison in terms of MSE of the log-mean estimators $\hat\delta_{lm,bc}$ with $\zeta = 1$ and $\zeta = 1.15$ ($10^5$ replicates), alongside the Cramér-Rao lower bound.

Figure 2: Comparison in terms of MSE of the Rényi entropy estimator with $\alpha = 0.98$ [22] ($10^5$ replicates), to the Cramér-Rao lower bound.

3.3 Tail bounds 

The length of the data sketch vector, $k$, is determined by the behaviour of the tail bounds. Given arbitrary parameters $\epsilon > 0$ and $0 < \gamma < 1$, we require that the estimation error be bounded as follows: $P(|\hat\delta_{lm} - \delta| > \epsilon) \le \gamma$, where $\hat\delta_{lm}$ is the estimator without the small-sample bias removed. Note that the absolute estimation error is independent of $\delta$ since the bias of $\hat\delta_{lm}$ is additive. Lemma 3.2 shows that for $\zeta \le 1$, the log-mean estimator has exponentially decreasing tail bounds. See Appendix for proof.

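Lemma 3.2. Exponentially decreasing tail bounds exist for $\zeta \le 1$ and arbitrary $\epsilon > 0$, with

$$P(\hat\delta_{lm} - \delta > \epsilon) \le \exp\Bigl(-k\,\frac{\epsilon^2}{G_R}\Bigr), \quad\text{and}\quad P(\hat\delta_{lm} - \delta < -\epsilon) \le \exp\Bigl(-k\,\frac{\epsilon^2}{G_L}\Bigr),$$

where $G_R = \epsilon^2 / \sup_{t>0} Q_\zeta(t, \epsilon)$, $G_L = \epsilon^2 / \sup_{t>0} Q_\zeta(-t, -\epsilon)$ and

$$Q_\zeta(t, \epsilon) = -\log\Bigl(\sum_{j=0}^{\infty} \frac{t^j j^{\zeta j}}{j!}\Bigr) + t e^{\zeta\epsilon}.$$

Furthermore, as $\epsilon \to 0$ both $G_R$ and $G_L$ tend to $2(4^{\zeta} - 1)/\zeta^2$.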

A sample complexity bound in the style of [2^] follows, stating that it suffices to let $k$ be of order $O(1/\epsilon^2)$ for the absolute error to be at most $\epsilon$ with probability exceeding $1 - \gamma$. This result agrees with theoretical lower bounds for similar estimation problems [18, 20, 23, 25].
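
As a worked step, the two one-sided bounds of Lemma 3.2 can be combined by a union bound to give an explicit sufficient sketch length. The constant $G = \max(G_R, G_L)$ is notation introduced here for illustration only.

```latex
P\bigl(|\hat\delta_{lm} - \delta| > \epsilon\bigr)
  \le e^{-k\epsilon^2/G_R} + e^{-k\epsilon^2/G_L}
  \le 2\,e^{-k\epsilon^2/G}
  \le \gamma
  \quad\text{whenever}\quad
  k \ge \frac{G}{\epsilon^2}\,\log\frac{2}{\gamma},
  \qquad G = \max(G_R, G_L).
```

If the limiting value $2(4^{\zeta} - 1)/\zeta^2$ is taken as an approximation to $G$ for small $\epsilon$, then $\zeta = 1$ gives $G \approx 6$ and a sketch length of roughly $6\,\epsilon^{-2}\log(2/\gamma)$.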



Figure 3 plots approximations to the left and right tail bound constants for various $\epsilon$, showing that these constants are small.

Figure 3: Tail bound constants $G_R$ (left) and $G_L$ (right) in Lemma 3.2 for various $\epsilon$ values approximated using Maple.
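
The tail bound constants can also be approximated numerically without a computer algebra system. The sketch below is an assumption-laden illustration rather than the computation used for Figure 3: it truncates the series in $Q_\zeta$ after 80 terms, restricts the search for the supremum to $0 < t \le 1$ (adequate for small $\epsilon$), relies on the concavity of the maximised function to justify a ternary search, and uses hypothetical function names.

```python
import math

def log_series(t, zeta, terms=80):
    """log of sum_{j>=0} t^j j^(zeta j) / j!, with the j = 0 term equal to 1."""
    if t == 0.0:
        return 0.0
    total = 1.0
    for j in range(1, terms):
        magnitude = math.exp(j * math.log(abs(t)) + zeta * j * math.log(j)
                             - math.lgamma(j + 1))
        # For negative t the terms alternate in sign.
        total += magnitude if (t > 0 or j % 2 == 0) else -magnitude
    return math.log(total)

def tail_constant(eps, zeta, side):
    """G_R for side = +1, G_L for side = -1: eps^2 / sup_t Q(side*t, side*eps)."""
    def q(t):
        return side * t * math.exp(side * zeta * eps) - log_series(side * t, zeta)
    lo, hi = 0.0, 1.0
    for _ in range(100):               # ternary search on a concave function
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if q(a) < q(b):
            lo = a
        else:
            hi = b
    return eps ** 2 / q((lo + hi) / 2)

zeta, eps = 0.5, 0.01
limit = 2 * (4 ** zeta - 1) / zeta ** 2
g_right = tail_constant(eps, zeta, +1)
g_left = tail_constant(eps, zeta, -1)
print(limit, g_right, g_left)
```

For $\zeta = 0.5$ the limiting constant is 8, and for $\epsilon = 0.01$ both computed constants should lie close to it.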

4 Conclusion 

The Shannon entropy of a data stream is a useful summary statistic in analyzing the evolving 
nature of the data, and has found many areas of application. The quest for efficient and accurate 
estimation in a one-pass algorithm using small time and space requirements is motivating ongoing 
research. We propose the log-mean estimator of entropy based on the method of data sketching via hashing to maximally skewed $\alpha$-stable random variables. Our approach is similar to that in [22], which is based on compressed counting and estimates the entropy in the limit as $\alpha \to 1$, but we estimate the entropy directly at the limit, avoiding the problem of how to choose $\alpha$ in practice.

The data is processed online, in a one-pass algorithm, storing and updating a sketch vector of length $k$. Each item type is hashed to a maximally skewed $\alpha$-stable random variable with $\alpha = 1$, and a linear combination, weighted by the associated quantity, is updated; this processing is performed $k$ times via independent hash functions. The running time is $O(k)$ per data element, assuming each hashing operation requires constant time. The storage requirement for the data sketch is $O(k)$.

We derive the characteristic function of the maximally skewed stable distribution with $\alpha = 1$, and the moments of the exponential of such a variable. Exploiting these properties, we derive a family of log-mean estimators of the Shannon entropy indexed by $\zeta$, and recommend the estimator with $\zeta = 1$. We explain that this estimator is near optimal, having asymptotic relative efficiency of 96.8%, and show empirically that it has good performance in small samples. Finally, we prove that for small $\epsilon$, the error probability of this estimator decreases exponentially with $k\epsilon^2$, which leads to a sample complexity bound on $k$, and on the storage requirements of the algorithm implementing this estimation procedure.

Appendix

Following [3], the stable distribution has four parameters: index $\alpha \in (0, 2]$, skewness parameter $\beta \in [-1, 1]$, location parameter $\delta \in \mathbb{R}$, and scale parameter $\gamma > 0$, denoted by $F(x; \alpha, \beta, \gamma, \delta)$. If $X$ has distribution $F(x; \alpha, \beta, \gamma, \delta)$, then its characteristic function (c.f.) $\phi(\theta)$, $\theta \in \mathbb{R}$, is given by

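$$\phi(\theta) = \mathrm{E}\exp(i\theta X) = \begin{cases} \exp\{-\gamma^{\alpha}|\theta|^{\alpha}[1 - i\beta\,\mathrm{sgn}(\theta)\tan(\pi\alpha/2)] + i\delta\theta\}, & \alpha \ne 1, \\ \exp\{-\gamma|\theta|[1 + i\beta(2/\pi)\,\mathrm{sgn}(\theta)\log|\theta|] + i\delta\theta\}, & \alpha = 1, \end{cases} \tag{5}$$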

where $\mathrm{E}$ denotes expected value, and $i = \sqrt{-1}$. If $\beta = \pm 1$, the distribution is called maximally skewed.

Proof of Lemma 2.2. The c.f. of $Z_\alpha$ can be written as

$$\phi(\theta) = \exp\{-|\theta|^{\alpha}\cos(\pi\alpha/2) + i|\theta|^{\alpha}\,\mathrm{sgn}(\theta)\sin(\pi\alpha/2)\},$$

where $\gamma^{\alpha} = \cos(\pi\alpha/2)$ in (5), and $\mathrm{sgn}(\theta) = \theta/|\theta|$ for $\theta \ne 0$, and 0 otherwise. As $\alpha \to 1$, $\phi(\theta) \to \exp(i\theta)$, so $\lim_{\alpha \to 1} Z_\alpha = 1$. It follows that the limit $Y_1 = \lim_{\alpha \to 1} Y_\alpha$ exists.

Next, we show that the moment generating function (m.g.f.) of $Y_\alpha$, defined by $\mathrm{E}\exp(\theta Y_\alpha)$, exists for $\theta > 0$, and use a result in [26] to conclude that the c.f. of $Y_\alpha$ equals the m.g.f. evaluated at $i\theta$. For $\theta > 0$,

$$\mathrm{E}\exp\{\theta Y_\alpha\} = (1 - \alpha)^{\theta} e^{\theta/(1-\alpha)}\, \mathrm{E}\exp\{-\theta Z_\alpha/(1 - \alpha)\} = (1 - \alpha)^{\theta} e^{-[\theta/(1-\alpha)]^{\alpha} + \theta/(1-\alpha)}, \tag{6}$$

where the last equality is given by the Laplace transform of $Z_\alpha$. So, the c.f. of $Y_\alpha$ is then given by

$$\exp\{i\theta[1/(1 - \alpha) + \log(1 - \alpha)] - [i\theta/(1 - \alpha)]^{\alpha}\}. \tag{7}$$

Letting $\alpha \to 1$ in (7), we have the desired limit. The moments of $\exp(Y_\alpha)$ follow by taking limits as $\alpha \to 1$ in (6). □

Proof of Lemma 3.1. Consider the following transformation: $w_j = e^{\zeta y_j} = e^{\zeta(\delta + z_j)} = e^{\zeta\delta} e^{\zeta z_j}$, where $z_j \sim G(z; 0)$ i.i.d. have characteristic function $\phi(\theta) = \mathrm{E}\exp(i\theta z_j) = (i\theta)^{i\theta}$, for $\theta \in \mathbb{R}$. Then, from Lemma 2.2, $\mathrm{E} w_j = e^{\zeta\delta}\zeta^{\zeta}$. Let $\eta = e^{\zeta\delta}$. The estimator $\hat\eta = \zeta^{-\zeta} k^{-1} \sum_{j=1}^{k} w_j$ is unbiased for $\eta$ and has variance $\eta^2 k^{-1}(4^{\zeta} - 1)$. Moreover, by the Central Limit Theorem, as $k \to \infty$,

$$\sqrt{k}\,\eta^{-1}(\hat\eta - \eta) \to \mathrm{Normal}(0, 4^{\zeta} - 1). \tag{8}$$

The log-mean estimator of $\delta$ is

$$\hat\delta_{lm} = \zeta^{-1}\log(\hat\eta) = \zeta^{-1}\log\Bigl(\zeta^{-\zeta} k^{-1} \sum_{j=1}^{k} \exp(\zeta y_j)\Bigr).$$

By the Delta Method applied to (8), as $k \to \infty$, $\sqrt{k}(\hat\delta_{lm} - \delta) \to \mathrm{Normal}(0, \zeta^{-2}(4^{\zeta} - 1))$. The small-sample bias of $\hat\delta_{lm}$ equals

$$\mathrm{E}\hat\delta_{lm} - \delta = \zeta^{-1}\,\mathrm{E}\log\Bigl(\zeta^{-\zeta} k^{-1} \sum_{j=1}^{k} \exp(\zeta z_j)\Bigr).$$

Finally, we want to find the optimal value of $\zeta$ that maximizes the ARE of $\hat\delta_{lm}$ relative to the MLE of $\delta$. The Cramér-Rao lower bound for estimating $\delta$ is approximately $(0.3445)^{-1}$, so the ARE is $\zeta^2/[0.3445(4^{\zeta} - 1)]$. This is a concave function that attains a maximum value of 0.978 when $\zeta \approx 1.15$. When $\zeta = 1.0$, the ARE evaluates to 0.968. □
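
The maximisation of the ARE can be checked numerically. The following is a minimal sketch that uses only the ARE expression and the value 0.3445 quoted above; the grid range, step size and function name are arbitrary choices made here.

```python
import math

FISHER_INFO = 0.3445  # Fisher information for delta, as quoted in the text

def are(zeta):
    """Asymptotic relative efficiency zeta^2 / [0.3445 (4^zeta - 1)]."""
    return zeta ** 2 / (FISHER_INFO * (4.0 ** zeta - 1.0))

# Simple grid search over zeta in [0.5, 2.0] with step 0.001.
grid = [0.5 + 0.001 * i for i in range(1501)]
best = max(grid, key=are)
print(are(1.0), best, are(best))
```

The printed values should round to 0.968 for $\zeta = 1$ and to a maximum of about 0.978 near $\zeta = 1.15$.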



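Proof of Lemma 3.2. For $\epsilon > 0$ and $t > 0$,

$$P(\hat\delta_{lm} - \delta > \epsilon) = P\Bigl(\zeta^{-\zeta} k^{-1} \sum_{j=1}^{k} \exp(\zeta z_j) > e^{\zeta\epsilon}\Bigr) \le \exp\{-t k e^{\zeta\epsilon}\}\, \mathrm{E}\exp\Bigl\{t\zeta^{-\zeta} \sum_{j=1}^{k} e^{\zeta z_j}\Bigr\},$$

by the Chernoff bound [13], provided the right hand side converges. Define $T_j = t\zeta^{-\zeta} e^{\zeta z_j}$, $j = 1, \ldots, k$. Then,

$$\mathrm{E}\exp\Bigl(\sum_{j=1}^{k} T_j\Bigr) = \Bigl\{\sum_{s=0}^{\infty} \frac{\mathrm{E}\,T_1^{s}}{s!}\Bigr\}^{k} = \Bigl\{\sum_{s=0}^{\infty} \frac{t^{s} s^{\zeta s}}{s!}\Bigr\}^{k}.$$

By the Ratio Test, the series is absolutely convergent for all $t > 0$ if $0 < \zeta < 1$, and for $0 < t < e^{-1}$ if $\zeta = 1$. If $\zeta > 1$, the series is divergent. Define $T = \{t;\ t > 0\}$ for $0 < \zeta < 1$, and $T = \{t;\ 0 < t < e^{-1}\}$ for $\zeta = 1$. It follows that, if $\zeta \le 1$, then $\hat\delta_{lm}$ has an exponentially decreasing right tail bound that satisfies

$$P(\hat\delta_{lm} - \delta > \epsilon) \le \exp\Bigl(-k\,\frac{\epsilon^2}{G_R}\Bigr),$$

where

$$\frac{\epsilon^2}{G_R} = \sup_{t \in T}\Bigl\{t e^{\zeta\epsilon} - \log\sum_{j=0}^{\infty} \frac{t^{j} j^{\zeta j}}{j!}\Bigr\}. \tag{9}$$

It is straightforward to show that the function maximized in (9) is concave. The result follows similarly for the left tail bound. Furthermore, by expanding the series in $Q_\zeta$ for small values of $t$ we can show that as $\epsilon \to 0$ both $G_R$ and $G_L$ converge to $2(4^{\zeta} - 1)/\zeta^2$. The details are as follows. Define

$$M_\zeta(t) = \sum_{j=0}^{\infty} \frac{t^{j} j^{\zeta j}}{j!}$$

and consider

$$K_\zeta(s, \epsilon) = \bigl(M_\zeta(s\epsilon)\exp(-s\epsilon e^{\zeta\epsilon})\bigr)^{1/\epsilon^2}, \qquad s > 0.$$

$K_\zeta(s, \epsilon)$ is a convex function [17], so it follows that $\inf_{s>0} K_\zeta(s, \epsilon) \to \inf_{s>0} K^{*}_\zeta(s)$, where $K^{*}_\zeta(s)$ is the pointwise limit of $K_\zeta(s, \epsilon)$ as $\epsilon \to 0$, provided this limit exists. Furthermore, since $1/G_R = -\log\bigl(\inf_{s>0} K_\zeta(s, \epsilon)\bigr)$, it follows that

$$\lim_{\epsilon \to 0} G_R = \frac{-1}{\log\bigl(\inf_{s>0} K^{*}_\zeta(s)\bigr)}.$$

To establish the pointwise limit, first note that if $s\epsilon \in T$, then

$$\sum_{j=3}^{\infty} \frac{(s\epsilon)^{j} j^{\zeta j}}{j!} = \epsilon^{3} \sum_{j=3}^{\infty} \frac{s^{j} \epsilon^{j-3} j^{\zeta j}}{j!} = o(\epsilon^2).$$

So that expanding in powers of $\epsilon$, we have that

$$\log K_\zeta(s, \epsilon) = \frac{1}{\epsilon^2}\Bigl[\log\Bigl(1 + s\epsilon + \frac{s^2\epsilon^2}{2}\,4^{\zeta} + o(\epsilon^2)\Bigr) - s\epsilon(1 + \zeta\epsilon) + o(\epsilon^2)\Bigr] \to \frac{s^2}{2}\{4^{\zeta} - 1\} - s\zeta = \log K^{*}_\zeta(s).$$

Differentiating with respect to $s$, we obtain

$$\inf_{s>0} \log K^{*}_\zeta(s) = -\frac{\zeta^2}{2(4^{\zeta} - 1)},$$

as required, where the convexity ensures a unique minimum. □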